TASK 1 and TASK 2

In these tasks, we are asked to choose 5 bookmakers and apply PCA to their odds data.

For example, for Pinnacle, the first component alone captures about 91% of the total variance, and the first two components together capture about 99%.

Also, the away-win odds (odd2) have the largest loading in absolute value on the first component (-0.957), so this is the feature that captures the most variability.

## Importance of components:
##                           Comp.1     Comp.2      Comp.3      Comp.4
## Standard deviation     0.1825755 0.05393369 0.012518614 0.006318502
## Proportion of Variance 0.9139673 0.07975649 0.004296931 0.001094647
## Cumulative Proportion  0.9139673 0.99372380 0.998020727 0.999115374
##                              Comp.5       Comp.6       Comp.7
## Standard deviation     0.0055127591 1.365384e-03 9.428989e-05
## Proportion of Variance 0.0008332664 5.111586e-05 2.437677e-07
## Cumulative Proportion  0.9999486404 9.999998e-01 1.000000e+00
## 
## Loadings:
##                    Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7
## odd1_Pinnacle_NA    0.194  0.916  0.240  0.164 -0.192              
## odd2_Pinnacle_NA   -0.957         0.166  0.132 -0.173              
## oddX_Pinnacle_NA   -0.214  0.371 -0.387 -0.422  0.690  0.113       
## over_Pinnacle_0.5                        0.108               -0.992
## over_Pinnacle_2.5                 0.668 -0.214  0.390 -0.590       
## under_Pinnacle_0.5                      -0.848 -0.506        -0.127
## under_Pinnacle_2.5               -0.559        -0.214 -0.796       
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.143  0.143  0.143  0.143  0.143  0.143  0.143
## Cumulative Var  0.143  0.286  0.429  0.571  0.714  0.857  1.000

## Importance of components:
##                           Comp.1     Comp.2     Comp.3      Comp.4
## Standard deviation     0.1390548 0.08641774 0.04472684 0.009121567
## Proportion of Variance 0.6681044 0.25803472 0.06912080 0.002874824
## Cumulative Proportion  0.6681044 0.92613914 0.99525994 0.998134766
##                             Comp.5       Comp.6       Comp.7       Comp.8
## Standard deviation     0.006217179 0.0028650498 0.0023409420 1.222277e-03
## Proportion of Variance 0.001335547 0.0002836199 0.0001893449 5.161925e-05
## Cumulative Proportion  0.999470313 0.9997539325 0.9999432774 9.999949e-01
##                              Comp.9
## Standard deviation     3.843178e-04
## Proportion of Variance 5.103326e-06
## Cumulative Proportion  1.000000e+00
## 
## Loadings:
##                  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## NO_Betway_NA                    0.109 -0.300 -0.462 -0.417  0.359  0.616
## YES_Betway_NA                  -0.103  0.293  0.391  0.195 -0.325  0.778
## odd1_Betway_NA   -0.234  0.505 -0.772  0.196 -0.231                     
## odd2_Betway_NA    0.799 -0.373 -0.368  0.186 -0.224                     
## oddX_Betway_NA    0.208  0.103 -0.337 -0.701  0.552         0.190       
## over_Betway_0.5                                     -0.144              
## over_Betway_2.5                        0.307  0.414 -0.834        -0.108
## under_Betway_0.5  0.507  0.763  0.362  0.153                            
## under_Betway_2.5                      -0.378 -0.217 -0.261 -0.850       
##                  Comp.9
## NO_Betway_NA           
## YES_Betway_NA          
## odd1_Betway_NA         
## odd2_Betway_NA         
## oddX_Betway_NA         
## over_Betway_0.5  -0.986
## over_Betway_2.5   0.135
## under_Betway_0.5       
## under_Betway_2.5       
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.111  0.111  0.111  0.111  0.111  0.111  0.111  0.111
## Cumulative Var  0.111  0.222  0.333  0.444  0.556  0.667  0.778  0.889
##                Comp.9
## SS loadings     1.000
## Proportion Var  0.111
## Cumulative Var  1.000

## Importance of components:
##                           Comp.1    Comp.2     Comp.3      Comp.4
## Standard deviation     0.1823609 0.1021055 0.04409562 0.007483578
## Proportion of Variance 0.7272582 0.2279942 0.04252224 0.001224740
## Cumulative Proportion  0.7272582 0.9552524 0.99777462 0.998999359
##                              Comp.5       Comp.6       Comp.7       Comp.8
## Standard deviation     0.0056889942 0.0028326541 1.753611e-03 1.494435e-03
## Proportion of Variance 0.0007077767 0.0001754738 6.724989e-05 4.884043e-05
## Cumulative Proportion  0.9997071360 0.9998826098 9.999499e-01 9.999987e-01
##                              Comp.9
## Standard deviation     2.438038e-04
## Proportion of Variance 1.299889e-06
## Cumulative Proportion  1.000000e+00
## 
## Loadings:
##                  Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## NO_Unibet_NA                    0.115 -0.349 -0.358 -0.492  0.380  0.591
## YES_Unibet_NA                  -0.104  0.348  0.262  0.197 -0.373  0.788
## odd1_Unibet_NA           0.576 -0.761        -0.272                     
## odd2_Unibet_NA    0.608 -0.613 -0.419        -0.266                     
## oddX_Unibet_NA    0.208        -0.318 -0.489  0.749         0.225       
## over_Unibet_0.5                                                         
## over_Unibet_2.5                        0.574  0.309 -0.732        -0.136
## under_Unibet_0.5  0.759  0.535  0.343  0.136                            
## under_Unibet_2.5                      -0.394        -0.412 -0.811       
##                  Comp.9
## NO_Unibet_NA           
## YES_Unibet_NA          
## odd1_Unibet_NA         
## odd2_Unibet_NA         
## oddX_Unibet_NA         
## over_Unibet_0.5  -0.995
## over_Unibet_2.5        
## under_Unibet_0.5       
## under_Unibet_2.5       
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.111  0.111  0.111  0.111  0.111  0.111  0.111  0.111
## Cumulative Var  0.111  0.222  0.333  0.444  0.556  0.667  0.778  0.889
##                Comp.9
## SS loadings     1.000
## Proportion Var  0.111
## Cumulative Var  1.000

## Importance of components:
##                           Comp.1    Comp.2     Comp.3      Comp.4
## Standard deviation     0.1515605 0.0742054 0.03799709 0.007034968
## Proportion of Variance 0.7652335 0.1834395 0.04809748 0.001648717
## Cumulative Proportion  0.7652335 0.9486730 0.99677046 0.998419173
##                             Comp.5       Comp.6       Comp.7       Comp.8
## Standard deviation     0.005917478 0.0026733107 0.0018943852 0.0012914873
## Proportion of Variance 0.001166528 0.0002380788 0.0001195524 0.0000555651
## Cumulative Proportion  0.999585701 0.9998237797 0.9999433321 0.9999988972
##                              Comp.9
## Standard deviation     1.819412e-04
## Proportion of Variance 1.102768e-06
## Cumulative Proportion  1.000000e+00
## 
## Loadings:
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## NO_bwin_NA                   -0.148  0.299 -0.390 -0.564  0.280  0.583
## YES_bwin_NA                   0.138 -0.216  0.302  0.358 -0.240  0.810
## odd1_bwin_NA    0.214 -0.662  0.652  0.291                            
## odd2_bwin_NA   -0.841  0.243  0.390  0.281                            
## oddX_bwin_NA   -0.244 -0.232  0.228 -0.799 -0.396 -0.139  0.139       
## over_bwin_0.5                                                         
## over_bwin_2.5                       -0.230  0.625 -0.708 -0.174       
## under_bwin_0.5 -0.427 -0.657 -0.565         0.246                     
## under_bwin_2.5                             -0.376 -0.147 -0.898       
##                Comp.9
## NO_bwin_NA           
## YES_bwin_NA          
## odd1_bwin_NA         
## odd2_bwin_NA         
## oddX_bwin_NA         
## over_bwin_0.5  -0.992
## over_bwin_2.5   0.109
## under_bwin_0.5       
## under_bwin_2.5       
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.111  0.111  0.111  0.111  0.111  0.111  0.111  0.111
## Cumulative Var  0.111  0.222  0.333  0.444  0.556  0.667  0.778  0.889
##                Comp.9
## SS loadings     1.000
## Proportion Var  0.111
## Cumulative Var  1.000

## Importance of components:
##                           Comp.1    Comp.2     Comp.3      Comp.4
## Standard deviation     0.2032587 0.1386602 0.06286807 0.009788170
## Proportion of Variance 0.6390488 0.2973987 0.06113583 0.001481967
## Cumulative Proportion  0.6390488 0.9364475 0.99758331 0.999065276
##                              Comp.5       Comp.6      Comp.7       Comp.8
## Standard deviation     0.0063465469 0.0034916697 0.002023776 1.789684e-03
## Proportion of Variance 0.0006230323 0.0001885827 0.000063352 4.954368e-05
## Cumulative Proportion  0.9996883080 0.9998768907 0.999940243 9.999898e-01
##                              Comp.9
## Standard deviation     8.125913e-04
## Proportion of Variance 1.021363e-05
## Cumulative Proportion  1.000000e+00
## 
## Loadings:
##                   Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## NO_Betsafe_NA                    0.124  0.452 -0.252 -0.519         0.666
## YES_Betsafe_NA                  -0.113 -0.420  0.103         0.816  0.355
## odd1_Betsafe_NA    0.111  0.565 -0.771        -0.267                     
## odd2_Betsafe_NA   -0.636 -0.586 -0.428        -0.257                     
## oddX_Betsafe_NA   -0.203        -0.287  0.159  0.884 -0.179 -0.136  0.128
## over_Betsafe_0.5                                     -0.106              
## over_Betsafe_2.5                       -0.633        -0.679 -0.325 -0.129
## under_Betsafe_0.5 -0.732  0.573  0.329 -0.152                            
## under_Betsafe_2.5                       0.406        -0.470  0.447 -0.626
##                   Comp.9
## NO_Betsafe_NA           
## YES_Betsafe_NA          
## odd1_Betsafe_NA         
## odd2_Betsafe_NA         
## oddX_Betsafe_NA         
## over_Betsafe_0.5   0.988
## over_Betsafe_2.5        
## under_Betsafe_0.5       
## under_Betsafe_2.5 -0.120
## 
##                Comp.1 Comp.2 Comp.3 Comp.4 Comp.5 Comp.6 Comp.7 Comp.8
## SS loadings     1.000  1.000  1.000  1.000  1.000  1.000  1.000  1.000
## Proportion Var  0.111  0.111  0.111  0.111  0.111  0.111  0.111  0.111
## Cumulative Var  0.111  0.222  0.333  0.444  0.556  0.667  0.778  0.889
##                Comp.9
## SS loadings     1.000
## Proportion Var  0.111
## Cumulative Var  1.000

Comments:

It is not easy to distinguish the over/under 2.5 outcomes by looking at the PCA results. On the other hand, the home/away/tie outcomes can be distinguished more easily in the PCA plots of all 5 bookmakers: the points form a V shape and the colors are not distributed homogeneously.

The MDS results with Euclidean distance do not differ noticeably from the PCA results.

MDS with Manhattan distance performs better than MDS with Euclidean distance: the V shape is sharper.
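
The comparison between PCA and the two MDS variants can be sketched as below. This assumes classical (metric) MDS via `cmdscale`, which the report does not state explicitly, and uses simulated data in place of the odds. Under that assumption, the similarity between Euclidean MDS and PCA is expected: classical MDS on Euclidean distances yields the same coordinates as the PCA scores, up to the sign of each axis.

```r
# Hypothetical data standing in for one bookmaker's feature matrix.
set.seed(2)
x <- matrix(rnorm(100 * 4), ncol = 4)

# Classical MDS in two dimensions with two different distance measures.
mds_euclidean <- cmdscale(dist(x, method = "euclidean"), k = 2)
mds_manhattan <- cmdscale(dist(x, method = "manhattan"), k = 2)

# PCA scores on the first two components for comparison.
pca_scores <- princomp(x)$scores[, 1:2]

# Compare absolute values because each axis is only defined up to its sign.
max_diff <- max(abs(abs(mds_euclidean) - abs(pca_scores)))
print(max_diff)
```

The Manhattan embedding has no such equivalence with PCA, so any visual difference (such as a sharper V shape) comes from the change of distance measure.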

TASK 3

Below are the image I chose, its RGB channels, its noisy version, and the channels of the noisy version.

## Importance of components:
##                           Comp.1     Comp.2     Comp.3
## Standard deviation     0.4529586 0.09690946 0.04991298
## Proportion of Variance 0.9452545 0.04326773 0.01147780
## Cumulative Proportion  0.9452545 0.98852220 1.00000000
## 
## Loadings:
##      Comp.1 Comp.2 Comp.3
## [1,] -0.572  0.707  0.415
## [2,] -0.587        -0.810
## [3,] -0.572 -0.707  0.415
## 
##                Comp.1 Comp.2 Comp.3
## SS loadings     1.000  1.000  1.000
## Proportion Var  0.333  0.333  0.333
## Cumulative Var  0.333  0.667  1.000

  1. I used the first, second and third components of PCA in turn to reconstruct the image. The first component captures the most variance (about 95%), as can be seen below.

  2. I plotted the first, second and third components of PCA as 3 by 3 images.
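
Both steps can be sketched as follows. This is an illustration under stated assumptions, not the code used for the report: the image is replaced by a small random RGB array, each pixel is treated as one observation with three channel values, and the helper names `reconstruct` and `rescale01` are hypothetical.

```r
# Hypothetical stand-in for the image: a random 12 x 10 RGB array with values in [0, 1].
set.seed(3)
h <- 12
w <- 10
img <- array(runif(h * w * 3), dim = c(h, w, 3))

# One row per pixel, one column per channel (R, G, B).
pixels <- matrix(img, ncol = 3)
pca <- princomp(pixels)
loadings <- unclass(pca$loadings)  # plain 3 x 3 matrix of eigenvectors

# Rebuild the pixel matrix from the chosen components, then add the channel means back.
reconstruct <- function(comps) {
  approx <- pca$scores[, comps, drop = FALSE] %*% t(loadings[, comps, drop = FALSE])
  sweep(approx, 2, pca$center, "+")
}

rec1 <- reconstruct(1)       # first component only
rec3 <- reconstruct(3)       # third component only
rec_all <- reconstruct(1:3)  # all components: recovers the input

# Mean squared reconstruction error for single components.
err1 <- mean((rec1 - pixels)^2)
err3 <- mean((rec3 - pixels)^2)

# Reshape a reconstruction back into an h x w x 3 image array.
rec1_img <- array(rec1, dim = c(h, w, 3))

# Each component's scores as an h x w grayscale image, rescaled to [0, 1].
rescale01 <- function(v) (v - min(v)) / (max(v) - min(v))
comp_imgs <- lapply(1:3, function(k) {
  matrix(rescale01(pca$scores[, k]), nrow = h, ncol = w)
})
```

Because the first component has the largest variance, `err1` is never larger than `err3`, which matches the observation that the first component gives the best single-component reconstruction. A component image can be displayed with, for example, `image(comp_imgs[[1]], col = gray.colors(256))`.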